flowchart LR
A[("Raw reads")] --> B{"fastp"}
B --> C{"FastQC"} & E{"STAR"}
C --> D{"MultiQC"}
E --> F{"HTSeq"}
F --> G[("Count table")]
G --> I{"DESeq2"} & Z{"limma"} & W{"PCA"}
Z --> H{"PCA"} & J{"BioNERO"}
I --> K{"GSEA"}
J --> L{"ORA"} & N{"STRING"}
G --> V{"GSVA"}
A:::Aqua
B:::Aqua
B:::Pine
C:::Pine
E:::Pine
D:::Pine
F:::Pine
G:::Aqua
I:::Pine
Z:::Pine
W:::Pine
H:::Pine
J:::Pine
K:::Pine
L:::Pine
N:::Pine
V:::Pine
classDef Aqua stroke-width:1px, stroke-dasharray:none, stroke:#46EDC8, fill:#DEFFF8, color:#378E7A
classDef Pine stroke-width:1px, stroke-dasharray:none, stroke:#254336, fill:#27654A, color:#FFFFFF
Bulk RNA-seq analysis of mice model melanoma
Material and methods
RNA expraction and sequencing
Total RNA was isolated using the RNeasy Plant Mini Kit (QIAGEN, Germany) following the manufacturer’s protocol. To eliminate potential DNA contamination, the RNA was further treated with the TURBO DNA-free Kit (Thermo Fisher Scientific, USA) in a 50 µL reaction volume. Purification was performed using Agencourt RNA Clean XP beads (Beckman Coulter, USA). RNA concentration and integrity were evaluated using the Quantitative RiboGreen RNA Assay (Thermo Fisher Scientific) and the RNA 6000 Pico Kit (Agilent Technologies, USA). RNA libraries were constructed using the NEBNext Poly(A) mRNA Magnetic Isolation Module followed by the KAPA RNA Hyper Kit (Roche, Switzerland), adhering to the manufacturer’s instructions. Final library purification was carried out with KAPA HyperPure Beads (Roche, Switzerland). Library fragment size distribution and purity were assessed using the High Sensitivity DNA Kit (Agilent Technologies). Quantification was performed with the Quant-iT DNA Assay Kit, High Sensitivity (Thermo Fisher Scientific). Equimolar pools of each library (10 pM) were subjected to high-throughput sequencing on the Illumina HiSeq 2500 platform using paired-end reads (2 × 100 bp) with a 2% PhiX spike-in control.
Bioinformatics analysis
Initial quality assessment of murine transcriptomic data was performed using FastQC (v0.12.1), with aggregated reports generated via MultiQC (v1.27.1) [Ewels et al., 2016]. Raw reads were processed using fastp (v0.23.4) to remove low-quality sequences [Chen et al., 2018]. Read alignment was subsequently conducted with STAR (v2.7.11) [Dobin et al., 2013] against the Mus musculus reference genome (GENCODE GRCm39.vM36) using the annotation file gencode.vM36.chr_patch_hapl_scaff.annotation.gtf, followed by gene-level quantification with HTSeq-count (v2.0.5) [Anders et al., 2015]. Differential gene expression analysis between experimental groups was performed using DESeq2 (v1.44.0) [Love et al., 2014]. For comprehensive functional interpretation, we implemented gene-set enrichment analysis (GSEA) through fgsea (v1.31.6) [Korotkevich et al., 2016] and clusterProfiler (v4.12.6) [Wu et al., 2021], utilizing both the MSigDB [Castanza et al., 2023] and CellMarker 2.0 [Hu et al., 2023] databases. Immune cell enrichment patterns were further assessed via Gene Set Variation Analysis (GSVA) [Hänzelmann et al., 2013]. Immune cell populations were deconvoluted from tumor transcriptome profiles using the mMCP-counter algorithm [Petitprez et al., 2020]. Co-expression network analysis was performed using BioNERO (v1.12.0) [Almeida-Silva et al., 2022], employing signed hybrid network topology with Pearson correlation metrics. Potential batch effects were addressed using the removeBatchEffect function from the limma package [Ritchie et al., 2015]. Functional annotation of co-expressed gene modules included over-representation analysis against multiple pathway databases including MSigDB, KEGG [Kanehisa et al., 2025], Reactome [Milacic et al., 2024], and WikiPathways [Agrawal et al., 2024]. Protein-protein interaction networks were reconstructed using the STRING database v12.0 through the STRINGdb (v2.16.4) Bioconductor package [Szklarczyk et al., 2023]. Visualization of analytical results was achieved using a combination of pheatmap (v1.0.12) for heatmap generation, ggplot2 (v3.5.1) for general plotting, and EnhancedVolcano (v1.22.0) for specialized visualization of differential expression results. The entire analytical workflow was implemented in R (v4.4.2).
References
DNA extraction and sequencing
RNeasy Plant Mini Kit (QIAGEN, German)
TURBO DNA-free kit (Thermo Fisher Scientific, USA)
Agencourt RNA Clean XP (Beckman Coulter, Brea, USA)
Quantitative RiboGreen RNA assay (Thermo Fisher Scientific)
RNA 6000 Pico Kit (Agilent Technologies, Santa Clara, CA, USA)
NEBNext Poly(A) mRNA Magnetic Isolation Module
KAPA RNA Hyper Kit (Roche, Switzerland)
KAPPA HyperPure beads (Roche, Switzerland)
Highly Sensitive DNA Kit (Agilent Technologies)
Quant-iT DNA Assay Kit, High Sensitivity (Thermo Fisher Scientific)
Databases
GENCODE: https://www.gencodegenes.org/mouse/
MSigDB: https://www.gsea-msigdb.org/gsea/msigdb
KEGG: https://www.genome.jp/kegg/
Reactome: https://reactome.org/
Wiki pathways: https://www.wikipathways.org/
CellMarker 2.0: http://www.bio-bigdata.center/
STRING: https://string-db.org/
Tools
FastQC: https://github.com/s-andrews/FastQC
MultiQC: https://github.com/MultiQC/MultiQC
fastp: https://github.com/OpenGene/fastp
STAR: https://github.com/alexdobin/STAR
HTSeq: https://github.com/simon-anders/htseq
DESeq2: https://github.com/thelovelab/DESeq2
fgsea: https://github.com/alserglab/fgsea
clusterProfiler: https://guangchuangyu.github.io/software/clusterProfiler/
GSVA: https://bioconductor.org/packages/release/bioc/html/GSVA.html
mMCP-counter: https://github.com/cit-bioinfo/mMCP-counter
biomaRt: https://github.com/grimbough/biomaRt
BioNERO: https://github.com/almeidasilvaf/BioNERO
limma: https://bioconductor.org/packages/release/bioc/html/limma.html
STRINGdb: https://www.bioconductor.org/packages/release/bioc/html/STRINGdb.html
pheatmap: https://github.com/raivokolde/pheatmap
ggplot2: https://github.com/tidyverse/ggplot2
EnhancedVolcano: https://github.com/kevinblighe/EnhancedVolcano
Raw data
PRJNA1214537: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1214537/
Results
Experimental Design and Sample Overview
The analysis included a total of 14 transcriptomic samples representing three experimental groups: 1) melanoma (n=6); 2) melanoma supplemented with Bifidobacterium adolescentis 150 (n=6); and 3) melanoma supplemented with Lacticaseibacillus rhamnosus K32 (n=2) (Table 1). The study was conducted in two sequential replicates: the first replicate comprised only the melanoma and B. adolescentis-supplemented groups (n=8), while the second replicate incorporated all three experimental conditions (n=6). The genomes of the bacteria used in this experiment are available at the following links GCF_001010915.1 and GCF_000735255.1. For clarity in data interpretation, we introduced the following standardized codings:
Experimental groups:
Melanoma –>
MMelanoma with B. adolescentis supplement –>
M_BIFMelanoma with L. rhamnosus supplement –>
M_LAC
Experimental batch:
First experiment –>
batch_1Second experiment –>
batch_2
Quality control
Following quality filtering with fastp, aggregated quality metrics were compiled using MultiQC to summarize the FastQC reports. As illustrated in Figure 1, the MultiQC quality profile indicates that the sequencing data maintained adequate overall quality. Post-filtering, approximately 198 million reads (mean ± SD: 14 ± 5 mln reads per sample) were retained for downstream analysis, with an average of 16 ± 7% of reads per sample removed during quality control (see Table 2 for detailed filtering statistics). Transcript abundance was quantified using the HTSeq tool. Figure 3 displays the distribution of HTSeq counts across all experimental samples. While no statistically significant differences in read counts were observed between experimental groups, batch effects significantly influenced read distribution (Figure 3). The resulting gene expression count matrix was normalized using the median of ratios method, with normalized values presented in Table 4.
MiltiQC report
Reads filtering by quality statistic
Mapping reads to reference counts
Saving 5 x 7 in image

ANOVA
Df Sum Sq Mean Sq F value Pr(>F)
batch 1 851.8 851.8 210.230 4.84e-08 ***
group 2 0.1 0.0 0.006 0.994
Residuals 10 40.5 4.1
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Normalized counts table
Principal component analysis
Principal component analysis (PCA) was employed as a linear dimensional reduction technique for exploratory data analysis and visualization. Prior to PCA, the expression data underwent variance-stabilizing transformation (VST), which derives stable variance estimates from fitted dispersion-mean relationships. This transformation processes count data normalized by size factors, generating a matrix of approximately homeostatic values characterized by constant variance across the range of mean expression levels, while simultaneously accounting for library size differences. The transformed data were visualized through bi-dimensional projection of the first two principal components (Figure 4), with the proportion of explained variance for each component detailed in Figure 5. Permutation multivariate analysis of variance (PERMANOVA) revealed statistically significant associations between expression profiles and batch effects, while no significant relationships with experimental groups were detected.
Saving 5 x 4 in image

Saving 5 x 4 in image

Permutation test for adonis under reduced model
Marginal effects of terms
Permutation: free
Number of permutations: 9999
adonis2(formula = t(vsd) ~ batch + group, data = meta_data, permutations = 9999, method = "euclidean", by = "margin")
Df SumOfSqs R2 F Pr(>F)
batch 1 9421 0.19069 3.7386 0.0086 **
group 2 7784 0.15756 1.5445 0.1325
Residual 10 25199 0.51007
Total 13 49404 1.00000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Differential expression analysis
Differential expression analysis was conducted using the DESeq2 package, employing the model formula ~ batch + group to account for batch-related dispersion effects. The analysis revealed positive log2 fold change (lfc) values associated with both B. adolescentis 150 and L. rhamnosus K32 supplementation groups, while negative values corresponded to the untreated melanoma control group. DESeq2 performs differential expression analysis by fitting generalized linear models to negative binomial distributed counts and evaluating linear contrasts between experimental groups. Our results demonstrated 3 significantly up-regulated and 16 down-regulated genes in the M_BIF group compared to the M control group. For the M_LAC comparison, 21 differentially expressed genes were identified relative to the M group, while no significant changes were observed in the reverse comparison. When directly comparing bacterial supplementation groups, 39 genes were associated with M_BIF and 116 with M_LAC. Differentially expressed genes were determined using stringent thresholds (adjusted p-value < 0.05, |lfc| > 1). Complete statistical results of the differential analysis are presented in Tables 4-6. Visual representation of the results through volcano plots (Figures 6-8) effectively illustrates the relationship between statistical significance (-log10 p-value) and magnitude of expression changes (lfc) across all comparisons.
M_BIF vs M

M_LAC vs M

M_BIF vs M_LAC

Gene set enrichment analysis
Gene Set Enrichment Analysis (GSEA) was employed to evaluate whether predefined gene sets exhibited statistically significant and concordant differences between biological states (e.g., phenotypic groups). This computational approach was implemented using multiple databases, enabling comprehensive identification of biological functions associated with the experimental conditions. For this analysis, we specifically utilized the HALLMARK MSigDB gene sets. The complete GSEA results are presented in Figure 9 and Tables 7-9. To delineate cell-type-specific expression patterns, we performed marker analysis using the Mouse Cell Marker 2.0 database, with results visualized in Figure 10 and detailed in Tables 10-12. Complementary to this, Gene Set Variation Analysis (GSVA) was conducted to examine HALLMARK MSigDB gene sets in a sample-oriented manner (Figures 11-16). The investigation was extended to immune cell profiling through two complementary approaches: mMCP-counter was implemented for broad immune cell type quantification (Figure 17), whileGSVA was specifically applied to assess critical immune parameters including M1/M2 macrophage polarization ratios and CD8+ T cell abundance (Figures 18-21).
MSigDb Hallmark
M_BIF vs M
M_LAC vs M
M_BIF vs M_LAC

Cell Marker 2.0
M_BIF vs M
M_LAC vs M
M_BIF vs M_LAC

Specific HALLMARK pathways levels
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 85.33 42.67 4.839 0.0339 *
batch 1 54.00 54.00 6.125 0.0328 *
Residuals 10 88.17 8.82
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 100.33 50.17 5.595 0.0234 *
batch 1 37.50 37.50 4.182 0.0681 .
Residuals 10 89.67 8.97
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 74.67 37.33 2.742 0.112
batch 1 16.67 16.67 1.224 0.294
Residuals 10 136.17 13.62
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 74.67 37.33 3.107 0.0892 .
batch 1 32.67 32.67 2.718 0.1302
Residuals 10 120.17 12.02
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 129.00 64.50 12.141 0.00211 **
batch 1 45.38 45.38 8.541 0.01523 *
Residuals 10 53.13 5.31
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 106.33 53.17 5.472 0.0248 *
batch 1 24.00 24.00 2.470 0.1471
Residuals 10 97.17 9.72
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Immune cell types deconvolution
mMCP-counter
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 149.33 74.67 30.90 5.24e-05 ***
batch 1 54.00 54.00 22.34 0.000808 ***
Residuals 10 24.17 2.42
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
GSVA
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 80.67 40.33 4.211 0.0471 *
batch 1 51.04 51.04 5.328 0.0436 *
Residuals 10 95.79 9.58
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 57.33 28.667 1.688 0.233
batch 1 0.38 0.375 0.022 0.885
Residuals 10 169.79 16.979
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 47.33 23.67 2.219 0.1594
batch 1 73.50 73.50 6.891 0.0254 *
Residuals 10 106.67 10.67
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Saving 3 x 4 in image

Df Sum Sq Mean Sq F value Pr(>F)
group 2 37.33 18.67 1.6 0.2495
batch 1 73.50 73.50 6.3 0.0309 *
Residuals 10 116.67 11.67
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Co-expression analysis
Gene co-expression analysis was performed to identify functionally related gene clusters and investigate their potential associations with phenotypic traits. Prior to analysis, technical batch effects were removed from the expression matrix using the removeBatchEffect function implemented in the limma package. PCA of the batch-corrected data showed similar sample clustering patterns to the uncorrected dataset (Figures 22 and 23), with PERMANOVA confirming the effective removal of batch effects while revealing no significant associations with experimental groups. The BioNERO package identified several co-expression modules showing differential associations with experimental conditions. Notably, the greenyellow and sienna3 modules demonstrated positive correlations with the untreated melanoma group, while the pink and royalblue modules showed negative associations (Figures 24-27, Tables 13 and 14). Functional characterization through over-representation analysis revealed significant biological pathways enriched in these modules (Tables 15 and 16). Hub gene analysis identified key regulatory elements within each module (Tables 17 and 18), with protein-protein interaction networks visualized for selected modules (Figure 28). Additional network characterization using STRING database highlighted potential functional relationships among module genes (Figures 29 and 30).
Batch correction
Saving 5 x 4 in image

Saving 5 x 4 in image

Permutation test for adonis under reduced model
Marginal effects of terms
Permutation: free
Number of permutations: 9999
adonis2(formula = t(mat) ~ batch + group, data = meta_data, permutations = 9999, by = "margin")
Df SumOfSqs R2 F Pr(>F)
batch 1 0.0001345 0.03397 0.4480 0.9468
group 2 0.0009222 0.23285 1.5355 0.1214
Residual 10 0.0030029 0.75823
Total 13 0.0039604 1.00000
Net construction
Saving 8 x 5 in image

Saving 8 x 6 in image

Co-expressed modules linked to M group

Saving 6 x 12 in image

Over-representation analysis
Hub genes analysis and visualization

STRING analysis of modules

